Abstract

Network coding achieves optimal throughput in multicast networks. However, throughput optimality
relies on the network nodes or routers to code correctly. A Byzantine node may introduce
junk packets in the network (thus polluting downstream packets and causing the sinks to receive the 
wrong data) or may choose coding coefficients in a way that significantly reduces the throughput of 
the network. 

Most prior work focused on the problem of Byzantine nodes polluting packets. However, even if a 
Byzantine node does not pollute packets, he can still significantly affect the throughput of the network
by not coding correctly. No previous work attempted to verify if a certain node coded correctly using 
random coefficients over all of the packets he was supposed to code over. 

We provide two novel protocols (which we call PIP and Log-PIP) for detecting whether a node coded 
correctly over all the packets received (i.e., according to a random linear network coding algorithm). 
Our protocols enable any node in the network to examine a packet received from another node by 
running a "verification test". With our protocols, the worst an adversary can do and still pass the 
packet verification test is in fact equivalent to random linear network coding, which has been shown 
to be optimal in multicast networks. Our protocols resist collusion among nodes and are applicable to 
a variety of settings. 

Our topology simulations show that the throughput in the worst case for our protocol is two to 
three times larger than the throughput in various adversarial strategies allowed by prior work. We 
implemented our protocols in C/C++ and Java, as well as incorporated them on the Android platform
(Nexus One). Our evaluation shows that our protocols impose modest overhead. 
1 Introduction



Network coding was first proposed by Ahlswede et al. [ACLY00], who demonstrated that, for certain
networks, network coding can produce a higher throughput than the best routing strategy. A subsequent
line of work that includes the works of Koetter et al. [KM03], Li et al. [LYC03], and Jaggi et al. [JSC+05]
showed that random linear coding reaches maximum throughput for multicast networks. Overall, network 
coding has proved better than routing for both wired and wireless networks and for both multicast and 
broadcast [NS08]; it has also found applications to increasing the robustness and throughput of peer-to-peer
networks (e.g., [GR05]) and to a variety of sensor wireless networks as surveyed by Narmawala and
Srivastava [NS08]. 

Throughput optimality requires diversity. The throughput guarantees of network coding, however, 
rely on the assumption that all the nodes in the network code correctly, i.e., each node in the network, when 
receiving packets, is assumed to transmit a packet that is a random linear combination of the incoming 
packets; informally, packets that are indeed linear combinations of the incoming packets are said to be 
valid, and packets that are random linear combinations of the incoming packets are said to be diverse. 

The assumption that each node in the network codes correctly may not hold because the network may 
contain Byzantine nodes, i.e., malicious or faulty nodes. For example, a Byzantine node may change
the payload or the coding vector in a way that is not a linear combination of the received packets, thereby 
transmitting an invalid (or polluted) packet. The invalid packet will mix with other packets and thus 
pollute more packets, ultimately causing the decoded information at the sinks to be incorrect. 

In fact, a Byzantine node can transmit a valid packet (i.e., a linear combination of the received packets), 
but still manage to decrease the overall throughput at the sinks. The Byzantine node could choose 
coefficients for the linear combination in a way that is not random: the node could forward one of the 
packets (by simply routing), code over only a subset of the packets, or, even worse, choose coefficients 
that do not contribute any new information to his receivers, thus, effectively sending nothing. While the 
network is not polluted by such a Byzantine node (and the decoded information at the sinks is still valid),
the throughput of the network is decreased. In Section 6, as an example, we show that such Byzantine 
nodes can indeed reduce the throughput to as much as a half or a third in some specific cases and as 
much as 20% on random topologies. Figure 1 shows a simple example of 50% throughput reduction on 
the standard butterfly topology. 







Figure 1: Example of throughput reduction caused by a Byzantine node on a butterfly 
network: (a) if node N1 is honest, he will send A + B, thus allowing both N2 and N3 to
recover both A and B; (b) if node N1 is Byzantine, he may choose to send only A, thus
halving the throughput at N2, which can now recover only A. 
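
The effect described in Figure 1 can be checked numerically by counting how many independent combinations reach N2. The sketch below is illustrative only: it assumes coding over GF(2) (so A + B is an XOR), assumes N2 hears A directly in addition to the packet from N1, and uses a hypothetical helper rank_gf2 that is not part of the protocols in this paper.

```python
def rank_gf2(vectors):
    """Rank of a list of 0/1 coding vectors over GF(2), by Gaussian elimination."""
    rows = [list(v) for v in vectors]
    ncols = len(rows[0]) if rows else 0
    rank = 0
    for col in range(ncols):
        # Find a row at or below the current rank with a 1 in this column.
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        # Clear this column in every other row (XOR is addition in GF(2)).
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return rank


# Coding vectors with respect to the source packets (A, B).
A = (1, 0)
A_PLUS_B = (1, 1)

# N2 is assumed to receive A directly, plus whatever N1 sends.
honest_at_n2 = rank_gf2([A, A_PLUS_B])   # N1 sends A + B
byzantine_at_n2 = rank_gf2([A, A])       # N1 forwards only A
```

The rank equals the number of source packets N2 can decode, so a rank of 1 instead of 2 corresponds to the halved throughput in panel (b).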

Insufficiency of prior work to guarantee correctness. A significant body of previous work that 
includes [KFM04], [CJL06], [GR06], [GR06], [ZKMH07], [YWRG08], [HLK+08], [JLK+08], [BFKW09],
[KTT09], [AB09], [DCNR09], [ABBF10], [LM10], [YSJL10], and [WVNK10] addressed the problem of
defending against pollution attacks, where the goal is to enforce or check that the packets sent by each
node are some (not necessarily random) linear combination of the packets sent by the source. Most prior
work on enforcing validity of packets has focused on detecting polluted packets right at the point where 
a Byzantine node injected them into the network [BFKW09], [ZKMH07], [KFM04], [YWRG08], [CJL06],
and [ABBF10]: when a Byzantine node injects an invalid packet into the network, the node receiving the 
packet is able to detect if the packet is invalid by running a test, and can discard the invalid packet right 
away. 
However, none of this work detects Byzantine nodes that deviate from random linear coding of the
received packets, thus allowing such Byzantine nodes to reduce throughput as already discussed above. 
In particular, Byzantine nodes are still allowed to simply forward a received packet (rather than to code 
over multiple packets) or use coefficients that provide no new degrees of freedom to downstream nodes, 
effectively sending no data. 

Our result. Given that Byzantine nodes may significantly affect the throughput of the network, we 
believe that it is important to study the following problem: 

How to force a node to code correctly? 
(That is, to code both validly and randomly, over all the received packets.) 

Our main contribution is a novel protocol for enabling each child of a node to detect whether the node 
coded correctly over all the packets he was supposed to code over (i.e., according to a random linear 
network coding algorithm). In our protocol, a child node C of a node N (where child means that C receives
data from his parent N) can check, by running a verification test, that the data from N is the result of 
correctly coding over the packets N receives from his parents. The node C need only examine the packet 
received from N and does not need to know the precise packet payloads used in coding at N. 

Let the required set of N, denoted $\mathcal{R}_N$, be the subset of the parents of N that N is expected to code
over. As we will discuss in Section 5, the exact definition of the required set depends on the application; 
the flexibility in defining it will enable our protocols to be applicable to a variety of settings. For example, 
some applications may require a node to code over the packets from all his parents; other applications, 
perhaps due to unreliability of the communication channel, may require nodes to code over at least some 
minimum number of parents. 

Using our protocols presented in Section 4, the child node C can ensure that: 

(i) the packet from N is the result of coding over the packets from all the nodes in $\mathcal{R}_N$, and

(ii) the coding coefficients used by N are pseudorandom. 

We provide two algorithms, with two different kinds of guarantees: Payload-Independent-Protocol (PIP) 
and Log-Verification PIP (Log-PIP). PIP always detects if N failed to code over all the packets from parents
in the required set, whereas Log-PIP detects such a violation with an adjustable probability. In cases where 
nodes can have many parents (say, more than 10), Log-PIP is faster and more bandwidth efficient. While 
we use pseudorandom coefficients instead of random ones, this does not affect the throughput guarantees 
of network coding (see Section 4.5); accordingly, we will use the two terms interchangeably in this paper. 

Furthermore, our protocols are resistant to collusion among nodes: even if the two Byzantine nodes 
N and C collude, the other honest children of N can still check whether N coded correctly over any
non-colluding parents.

Finally, we assume that there exist penalties for nodes that are found to send incorrect packets, and we 
assume that they drive incentives against cheating in a detectable manner. A discussion of the exact form 
of such penalties of course lies outside of the scope of this paper, and one should choose the penalty that
best fits one's application. To facilitate the use of a penalty system, though, our protocol enables nodes
to prove (and not only detect) that a parent cheated (i.e., did not code correctly); moreover, Byzantine 
nodes cannot falsely accuse honest nodes of not coding correctly. 

Thus, we assume that Byzantine nodes will not cheat in a detectable way. We therefore consider an 
adversarial model in which Byzantine nodes perform the worst possible action to pass the verification test. 
In Section 4.8, we prove that the worst an adversary can do and still pass our packet verification tests is to 
code correctly (i.e., according to a random linear network coding scheme), which has been shown to give 
optimal throughput in multicast networks. 

Implementation and evaluation. Our simulations in Section 6 show that the throughput in the best 
adversarial strategy for our protocol is two to three times larger than the throughput in several adversarial 
strategies allowed by prior work. 

We implemented our protocols in C/C++. We also wrote a Java implementation for Java-based P2P 
applications and an Android package for smartphone P2P file sharing. Our C/C++ evaluations show that 
the protocols are reasonably efficient: the running time at a node to prepare for transmitting the data is 
less than 0.3 ms, and the time to perform a verification test is 3.7 ms with PIP and 1.4 ms with Log-PIP. 
Compared to the overhead introduced by a pollution detection scheme that we analyzed [BFKW09], the 
additional overheads introduced by our two protocols are respectively less than 2% for PIP and less than 
0.5% for Log-PIP. This suggests that, if one is already using a pollution detection scheme, then additionally 
enforcing diversity of packets will not affect performance by much. Moreover, the overhead of both of our
protocols is independent of how large the packet payload is.

2 Related Work 

Ahlswede et al. [ACLY00] pioneered the field of network coding. They showed the value of coding
at routers and provided theoretical bounds on the capacity of such networks. Works such as those of 
Koetter et al. [KM03], Li et al. [LYC03], and Jaggi et al. [JSC+05] show that, for multicast traffic, linear 
codes achieve maximum throughput, while coding and decoding can be done in polynomial time. Ho et 
al. [HKM+03] show that random network coding can also achieve maximum network capacity. Network 
coding has been shown to improve throughput in a variety of networks: wireless [LMK05], peer-to-peer 
content distribution [GR05], energy [WNE00], distributed storage [Jia06], and others.

Despite its throughput benefits, however, network coding is susceptible to Byzantine attacks. A Byzantine
node can inject into the network junk packets, which will mix with correct packets and generate more
junk packets, thus resulting in junk data at the sink. 

A significant amount of research aims to prevent or recover from pollution attacks [KFM04],
[CJL06], [GR06], [GR06], [ZKMH07], [YWRG08], [HLK+08], [JLK+08], [BFKW09], [KTT09], [AB09], 
[DCNR09], [ABBF10], [LM10], [YSJL10], and [WVNK10]. Ho et al. [HLK+08] attempt to detect at the 
sinks if the packets have been modified by a Byzantine node. They do so by adding hash symbols that 
are obtained as a polynomial function of the data symbols, and pollution is indicated by an inconsistency 
between the packets and the hashes. 

Jaggi et al. [JLK+08], for example, discuss rate-optimal protocols that survive Byzantine attacks. 
Their idea is to append extra parity information to the source messages. Kosut et al. [KTT09] provide 
non-linear protocols for achieving capacity in the presence of Byzantine adversaries. 

There has also been important work on the problem of detecting polluted packets when they are
injected; see for example [KFM04], [CJL06], [GR06], [GR06], [ZKMH07], [YWRG08], [BFKW09], [DCNR09],
[ABBF10], and [WVNK10]. These schemes are helpful because they prevent polluted packets from mixing 
with other packets. The most common approach has been the use of a homomorphic cryptographic scheme 
(such as signature) [BFKW09], [ZKMH07], [KFM04], [YWRG08], [CJL06], [AB09], and [ABBF10]. In a
peer-to-peer setting, Krohn et al. [KFM04] propose a scheme based on homomorphic hashes to detect 
on the fly whether a received packet is valid. The homomorphic hashes are used to verify if the check 
blocks of downloaded files are indeed linear combinations of the original file blocks. Gkantsidis and
Rodriguez [GR06] further extend the approach of Krohn et al. to resist pollution attacks in peer-to-peer 
file distribution systems that use network coding. They also mention the entropy attack, which is similar 
to our diversity attack. However, they do not solve the problem of forcing a Byzantine client to code
diversely. Their approach is to have a node download coding coefficients from neighbors and decide from
which neighbors to download the data to get the most innovative packets. However, a Byzantine client
can still avoid coding diversely; for example, he can choose not to code over the data from a parent that he
knows would provide innovative information to his neighbors, thus reducing overall throughput. 

Wan et al. [WVNK10] propose limiting pollution attacks by identifying the malicious nodes, so that
they can be isolated, and Le and Markopoulou [LM10] by identifying the precise location of Byzantine 
attackers using a homomorphic MAC scheme. 

Zhao et al. [ZKMH07] provide a signature scheme for content distribution with network coding based
on linear algebra and cryptography. The source provides all nodes with an invariant vector and public key
information. With that information, all nodes can check on the fly the validity of a packet. [YWRG08]
provides homomorphic signature schemes for preventing such Byzantine attacks, but the paper is vacuous
due to a flaw. [CJL06] and [BFKW09] also provide homomorphic signature schemes, with a construction
based on elliptic curves. This scheme augments the packet size by only one constant of about 1024 bits.

Another recent approach to detecting polluted packets is the algebraic watchdog [KMB10, LAV10] in 
which nodes sniff on packets from other nodes and try to establish if they are polluted. 

However, all these schemes only check whether a packet is valid; they cannot establish whether a packet is
diverse. Even if packet validity checks prevent Byzantine nodes from sending junk packets,
there are still other ways in which a Byzantine node can affect the throughput
without violating any validity checks. For example, a Byzantine node can simply not send any data, he
can forward one of the received packets (without coding), he can code with fixed coefficients, or he can 
choose coefficients that minimize the network throughput. In Section 6, we show that Byzantine behavior 
of this kind does indeed significantly decrease throughput. None of these behaviors is considered (or
prevented) by previous work on pollution attacks.

3 Model 

We present the network model and then formulate the security problem that we want to solve. In Section 5, 
we explain how our model and protocols apply to a variety of problem domains. 

3.1 Network Model 

We consider a network where nodes perform random linear network coding [HKM+03] over some finite
field. Roughly, each packet is a pair consisting of a payload M and a coding vector C; nodes "code" by 
choosing random coefficients and using them to compute linear combinations of the received packets. For 
example: node N receives two packets $(M_1, C_1)$ and $(M_2, C_2)$; to random linear network code these packets,
N chooses two random coefficients $\alpha_1$ and $\alpha_2$ from a certain finite field and computes the resulting coded
packet as $(\alpha_1 M_1 + \alpha_2 M_2, \alpha_1 C_1 + \alpha_2 C_2)$, where the computations are also performed in the finite field. In
Section 4.1, we provide more details about the structure of a packet. 
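
As a concrete illustration of the coding step just described, the following minimal sketch combines packets with random coefficients over a prime field. The modulus Q, the function name code_packets, and the use of Python's non-cryptographic random module are assumptions made for the example; they are not taken from the implementation evaluated in this paper.

```python
import random

Q = 2**31 - 1  # toy prime modulus (assumption); the paper's q is much larger


def code_packets(packets, q=Q, rng=random):
    """Random linear combination of packets given as (payload, coding_vector).

    Returns (coefficients, (coded_payload, coded_vector)); all arithmetic is mod q.
    """
    # One nonzero coefficient per incoming packet.
    coeffs = [rng.randrange(1, q) for _ in packets]
    n = len(packets[0][0])  # payload length
    m = len(packets[0][1])  # coding vector length
    payload = [sum(a * M[k] for a, (M, _) in zip(coeffs, packets)) % q
               for k in range(n)]
    vector = [sum(a * C[k] for a, (_, C) in zip(coeffs, packets)) % q
              for k in range(m)]
    return coeffs, (payload, vector)


# Two uncoded source packets, so their coding vectors are unit vectors.
p1 = ([5, 7, 11], [1, 0])
p2 = ([2, 3, 4], [0, 1])
coeffs, (M, C) = code_packets([p1, p2], rng=random.Random(0))
```

Because the inputs here are uncoded, the resulting coding vector C is exactly the list of coefficients that were drawn, which is what lets a sink undo the combination later.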

The network is modeled as a directed graph in the natural way: each node in the network corresponds 
to a vertex in the graph, and if a node N sends data to another node N' then there is a directed edge in 
the graph from the vertex (corresponding to the node) N to the vertex (corresponding to the node) N'; we 
then say that N is a parent of N' and that N' is a child of N; similarly, if there is a directed edge from N' 
to N", we say that N is a grandparent of N" and N" is a grandchild of N. Each node sends one packet per 
time period to each of his children. 

We always denote a generic node in the network by N; he has parents denoted by Pi and children
denoted by Cj. We denote by $\mathcal{P}_N$ the set of parents of N. As discussed, the required set of N, denoted
by $\mathcal{R}_N$, is the subset of $\mathcal{P}_N$ indicating which parents the node N should code over. Ideally, the required
set would be equal to the parent set, but this may not be possible in all settings or applications. (See 
Section 5 where we discuss various choices of the required set.) See Figure 2 for a diagram of a network 
using our notation. 




Figure 2: A source node S sends data to two destination nodes D1 and D2. A generic node
N somewhere in the graph has parent nodes P1, P2, P3, and P4 (of which P1 and P2 form
his required set $\mathcal{R}_N$) and has children nodes C1, C2, and C3.

Each node N has a public key $pk_N$ and a corresponding secret key $sk_N$. We assume that each node
knows the public key $pk_S$ of the source S; this is a reasonable assumption present in most previous work
on pollution attacks [BFKW09], [ZKMH07], [KFM04], [YWRG08], [CJL06], and [ABBF10]; for example,
a node may be given this public key upon entering the system. 

In some settings (Section 5), we will need each node N to have a certificate $cert_N$ that his public
key is valid and belongs to him; $cert_N$ consists of a signature from the source or some other trusted party:
sig($pk_N$, "this is the public key of N"). A node need only obtain such a signature once per lifetime of the
node; this can be done, for example, when the node joins the network.

In order for a child Cj to check that his parent coded correctly using the protocols that we present in 
Section 4, Cj needs to know the required set $\mathcal{R}_N$ of N and the public keys of the nodes
in this set. Nodes do not need to know the required set (or the set of grandparents) for their parents a 
priori; in fact, dynamically adjusting the required set is important for dynamic networks. In Section 5.2, 
we explain how nodes can acquire the required set for each of their parents depending on the application.
We also explain for which applications our protocols are most fit and for which they are not fit. For now,
assume that each node knows precisely the nodes in the required set of each parent.

3.2 Threat Model 

Nodes in the network may be Byzantine (i.e., malicious or faulty): a node can pollute the data coming 
from the source by sending out a packet that is invalid or decrease the throughput by sending a packet 
that is not a result of coding over packets received from each parent in the required set. In Section 6, we 
discuss several Byzantine behaviors and how they affect the throughput of the network. 

Even worse, Byzantine nodes can collude among each other. A node can collude with his parents, 
children or any other node in the network to pass the verification tests at his honest children. 

We consider the adversarial model in which Byzantine nodes will use the best adversarial strategy 
to decrease the throughput at the sinks while still passing our verification tests. As already discussed, 
we assume that there exist penalties in place that create enough incentives for not cheating detectably; 
a discussion of what these penalties should be (e.g., a fine, an investigation, removal from the system, 
resource choking, reputation decrease, or making topology adjustments) is out of the scope of this paper 
and one should choose what best fits one's application. 

3.3 Solution Approach and Goals 

Similarly to prior work on pollution signatures, we also take a "verification test" solution approach. Our 
technical goal is to design a protocol that provably implements such a test for correctness: 

Verification test by node Cj when receiving packet P from node N. A procedure 
run by child Cj upon receiving a packet P from parent N to verify that node N generated P 
by coding correctly (i.e., using pseudorandom coefficients over a packet from each parent 
in the required set of N). 

If a Byzantine node N passes the verification test performed by an honest child Cj, the Byzantine node 
must have coded correctly over the required data. Therefore, such a verification test would achieve the 
goal of this paper, because each honest node in the network has the ability to enforce correct random 
linear network coding at each of his parents. 

Specifically, the verification test should satisfy the following properties: 

1. A Byzantine node that does not follow the random linear coding algorithm should be detected with 
overwhelming probability. 

2. The test must be efficient with respect to computation and bandwidth. 

3. The verification test must be collusion resistant: an honest child should be able to check if his 
parent coded over all the honest nodes in his required set, regardless of whether other children or 
grandparents are Byzantine or not. 

4. If the verification test fails, it is possible to prove it. In particular, this implies that a node can
not only detect but also prove that a parent cheated.

We require that the computational overhead that each node incurs by running the verification test is 
reasonable and, moreover, we also require that the increase in packet size (due to the extra information 
sent to later nodes in order to enable them to run the verification test) does not depend on the payload of 
the packet. (Recall that network coding is particularly useful when the packet payload is large and the 
overhead of the coefficients becomes negligible.) 

The protocols we propose (and which are presented in Section 4) achieve the above four properties. 

We remark that tackling collusion is challenging. For example, a node N could collude with a child 
Cj: N could send a packet to Cj that is not the result of coding over all the nodes in the required set 
with pseudorandom coefficients, and Cj would simply neglect running the verification test on N. Still, 
we want to ensure that the other, honest children of N can verify that they do receive correctly-coded 
packets. This means that each child node Cj must be able to independently check N and not rely on any 
shared information that is required to stay secret. Similarly, ideally, if some parents collude with N, N's 
children should still be able to check that N coded over all the required parents that did not collude with 
N. This means that the parents cannot share secret data in the protocol either, which makes
the cryptographic protocol more challenging to design.

Finally, while the network model that we adopt is simple, we show in Section 5 that it is expressive: 
there we explain how to use this model for a variety of network settings and applications, either directly 
or with simple extensions. 

4 Protocol 

We describe the protocols a node needs to run to perform the verification test on each of his parents and 
to assemble packets to send to his children. For clarity, we present the protocols in an incremental fashion, 
by successively adding more security properties. But first we will need to introduce some basic notation 
and cryptographic tools that we use. 

4.1 Notation 

A sequence (or tuple) of n components $x_1, \ldots, x_n$ is denoted by $(x_1, \ldots, x_n)$ or $(x_i)_{i=1}^{n}$; for simplicity,
sometimes we omit the starting and ending indices of the sequence, thus only writing $(x_i)_i$. The concatenation
of two strings a and b is denoted by $a \| b$.

We denote by $|\mathcal{R}_N|$ the number of (parent) nodes in the required set $\mathcal{R}_N$ of node N; by $pk_N$ and $sk_N$
the public and secret keys of node N; and by $sig_N(x)$ a signature of a message x with respect to the
key pair $(pk_N, sk_N)$ of N, where the underlying signature scheme is assumed to satisfy the usual notion of
unforgeability (i.e., existential unforgeability under chosen-message attack). For concreteness, we use the 
DSA algorithm [NIS], whose signatures are only 320 bits long. 

Let q be the prime number used in any of the pollution signature schemes in [BFKW09], [ZKMH07],
[KFM04], [YWRG08], [CJL06], and [ABBF10]. For example, in [BFKW09], q is a 160-bit prime. 

In network coding, as already mentioned, a packet has the form E = (M, C), where M is the payload and 
C the coding vector. (In our protocols, we will augment the packet with additional tokens.) The payload 
M is an n-tuple $(m_1, \ldots, m_n)$ of chunks, where each chunk $m_i$ is an element of $\mathbb{Z}_q^*$, the multiplicative group
of integers modulo the prime q. A coding vector C is an m-tuple $(c_1, \ldots, c_m)$ of chunks, where each chunk
is also an element of $\mathbb{Z}_q^*$. Hence, E consists of n + m chunks $e_1, \ldots, e_{n+m}$, where $e_i = m_i$ for $1 \le i \le n$
and $e_i = c_{i-n}$ for $n < i \le n + m$. In particular, we can think of M, C, and E as vectors in some product
space of $\mathbb{Z}_q^*$.

4.2 Cryptographic tools 

We now briefly review the cryptographic tools that we employ in our protocols: 

Pseudorandom functions. Informally, a pseudorandom function family is a family of polynomial-time
computable functions $\{F_s : \{0,1\}^{|s|} \to \{0,1\}^{|s|}\}_{s \in \{0,1\}^*}$ with the property that, for a sufficiently
large security parameter k and a random k-bit seed s, $F_s$ "looks like" a random function to any efficient
procedure. See [GGM86] for more details. 
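
To make the notion concrete, the sketch below derives a field coefficient from a seed using HMAC-SHA256 as a stand-in for $F_s$. This is an illustration under stated assumptions: the choice of HMAC-SHA256, the function name prf_coefficient, the label format, and the toy modulus are all hypothetical, and this section does not specify which seed or input the protocols feed to the pseudorandom function.

```python
import hashlib
import hmac


def prf_coefficient(seed: bytes, label: bytes, q: int) -> int:
    """Map (seed, label) to a nonzero coefficient in [1, q - 1].

    HMAC-SHA256 plays the role of the pseudorandom function (an assumption).
    Reducing the digest modulo q - 1 introduces a small bias, which is
    negligible only when q is much smaller than 2**256.
    """
    digest = hmac.new(seed, label, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % (q - 1) + 1


q = 2**61 - 1  # toy prime modulus (assumption)
c1 = prf_coefficient(b"example-seed", b"packet-1", q)
c2 = prf_coefficient(b"example-seed", b"packet-1", q)
```

The point of using a keyed function rather than fresh randomness is that anyone holding the same seed and label can recompute the same coefficient, so the choice of coefficient becomes checkable.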

Merkle hashes. A Merkle hash [Mer89] is a concise commitment to n elements. Suppose that Alice has 
n elements and she gives Bob a Merkle hash of them. Later, when Bob asks to see some elements from 
Alice, the Merkle hash allows Bob to check that indeed the elements Alice gives him are the same elements 
over which she had computed the Merkle hash. Loosely, to compute the Merkle hash of n elements, Alice 
places the elements at the leaves of a full binary tree; she recursively computes each node higher in the 
tree as the hash of the concatenation of the two children. The resulting hash at the root is called the 
Merkle hash/commitment of the n elements. Given n elements and their Merkle hash, Alice can reveal an
element, say element i, to Bob by revealing the label of every node (and its sibling) along the path from
the leaf node containing element i to the root; Bob verifies the correctness of element i by re-hashing the
elements bottom-up and then verifying that the resulting hash is equal to the claimed Merkle hash. The
advantage of the Merkle hash is that Bob only needs to ask for O(log n) elements from Alice to check that an
element out of n has been correctly included in the Merkle hash. See [Mer89] for more details.
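
The commit-and-reveal procedure above can be sketched as follows. This is a minimal illustration, not the implementation evaluated in this paper: it assumes SHA-256 as the hash, hashes each element once before placing it at a leaf, and requires the number of elements to be a power of two; the function names are hypothetical.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root_and_proof(elements, index):
    """Return (root, proof) for elements[index].

    proof lists (sibling_hash, sibling_is_left) from the leaf level upward.
    Assumes len(elements) is a power of two.
    """
    level = [_h(e) for e in elements]
    proof = []
    while len(level) > 1:
        sibling = index ^ 1  # neighbor sharing the same parent
        proof.append((level[sibling], sibling < index))
        # Each parent is the hash of the concatenation of its two children.
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def merkle_verify(root, element, proof):
    """Re-hash bottom-up along the revealed path and compare with the root."""
    node = _h(element)
    for sibling, sibling_is_left in proof:
        node = _h(sibling + node) if sibling_is_left else _h(node + sibling)
    return node == root


elements = [bytes([i]) * 4 for i in range(8)]
root, proof = merkle_root_and_proof(elements, 5)
```

For 8 elements the proof holds 3 sibling hashes, matching the O(log n) cost stated above.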

Pollution signatures. A pollution signature scheme (such as [BFKW09], [ZKMH07], [KFM04], [YWRG08], 
[CJL06], and [ABBF10]) is a signature scheme consisting of the usual triplet of algorithms (gen, sig, ver) 
with a special homomorphic property that allows it to be used to detect pollution attacks in network 
coding. 
Specifically, the source S runs the key generation algorithm gen to produce a secret key $sk_S$, together
with a corresponding public key $pk_S$ that is published for everyone to use. The source S augments each
outgoing packet E with a special signature $\sigma_S(E)$, generated by running the algorithm sig on input the
secret key $sk_S$ and the packet E; we refer to this special signature as a validity signature of the packet E
with respect to the public key $pk_S$.

When a node receives a (signed) packet $(E, \sigma_S(E))$, he verifies the signature on the packet, by running
the algorithm ver on input the public key $pk_S$, the packet E, and the signature $\sigma_S(E)$.

Pollution signature schemes have the useful homomorphic property that, when given several packets 
together with their validity signatures, any node is able to compute a validity signature of any linear 
combination of those packets, without communicating with the source S. For example, if a node N receives
two (signed) packets $(E_1, \sigma_S(E_1))$ and $(E_2, \sigma_S(E_2))$, then, for any two coefficients $\alpha$ and $\beta$, N can
compute a validity signature of the packet $E = \alpha E_1 + \beta E_2$; in some schemes, this is done by computing
$\sigma_S(\alpha E_1 + \beta E_2) = \sigma_S(E_1)^{\alpha} \cdot \sigma_S(E_2)^{\beta}$, where each of these computations is performed in a certain field
and the equality holds due to homomorphism. See [BFKW09], [ZKMH07], [KFM04], [YWRG08], [CJL06],
and [ABBF10] for more details. 
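
The homomorphic identity can be observed on a toy example. The sketch below is not a signature scheme and is not any of the cited constructions: it is an unkeyed homomorphic hash over a tiny group, with insecure parameters chosen only so that the arithmetic is easy to follow. The names P, Q, G, toy_tag, and combine are hypothetical.

```python
P, Q = 23, 11        # toy primes with Q dividing P - 1 (insecure sizes, assumption)
G = [2, 4, 8, 16]    # elements of order Q modulo P (2 has order 11 modulo 23)


def toy_tag(packet):
    """Tag of a packet (e_1, ..., e_k): the product of G[i] ** e_i modulo P."""
    out = 1
    for g, e in zip(G, packet):
        out = out * pow(g, e, P) % P
    return out


def combine(alpha, e1, beta, e2):
    """Linear combination alpha * e1 + beta * e2, chunk by chunk, modulo Q."""
    return [(alpha * x + beta * y) % Q for x, y in zip(e1, e2)]


E1 = [3, 1, 4, 1]
E2 = [5, 9, 2, 6]
alpha, beta = 7, 5

# Tag computed directly on the coded packet ...
lhs = toy_tag(combine(alpha, E1, beta, E2))
# ... equals the tag derived from the two incoming tags alone.
rhs = pow(toy_tag(E1), alpha, P) * pow(toy_tag(E2), beta, P) % P
```

The two values agree because every element of G has order Q, so exponents can be reduced modulo Q; this is the property that lets an intermediate node produce a tag for a coded packet without any secret key.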

4.3 A Generic Protocol 

In order to avoid repetition in the presentation of our protocols, in this section we introduce the general 
structure that will be followed by each protocol version that we present; later, in any given protocol version, 
we will replace any unspecified quantities or procedures with concrete values or algorithms. 

First we discuss the new packet structure: every packet E transmitted by a generic node N is augmented 
with three cryptographic tokens; the first token has already been used in prior work, while the last two 
tokens are new to our protocols: 

1. A validity signature $\sigma_S(E)$, which is used to prevent pollution attacks. Any (secure) pollution
signature scheme [BFKW09], [ZKMH07], [KFM04], [YWRG08], [CJL06], and [ABBF10] may be
used to produce this signature (as we rely only on the guarantees it provides and not on details of 
its implementation). 

2. A test token $T_N$, which is used by each child Cj of N to run the verification test on N, denoted
VerifTest. 

3. A helper token $H_N$, which is used by each child Cj of N to produce his own test token $T_{C_j}$, using a
procedure called Combine. 

Specifically, the protocol that a generic node N runs, after receiving packets from his parents, in order to 
produce a packet for each of his children, takes the general form of Algorithm 1, where the procedures 
VerifTest, CheckHelper, and Combine, as well as the value of $H_N$, will be specified later:



Algorithm 1 Protocol at a generic node N
1: From each parent node $P_i \in \mathcal{P}_N$, node N receives a packet $(E_{P_i}, \sigma_S(E_{P_i}), T_{P_i}, H_{P_i})$.
2: For each parent node $P_i \in \mathcal{P}_N$, node N verifies that $\sigma_S(E_{P_i})$ is a valid signature of $E_{P_i}$ using the public
key $pk_S$ of the source S, verifies that VerifTest$(E_{P_i}, \sigma_S(E_{P_i}), T_{P_i})$ accepts, and that $H_{P_i}$ is correct
using CheckHelper$(E_{P_i}, \sigma_S(E_{P_i}), H_{P_i})$.
3: Node N computes $E_N$ by coding over all $E_{P_i}$, and computes $\sigma_S(E_N)$, as described at the end of Section 4.2.
4: Node N computes $T_N$ = Combine$((E_{P_i}, \sigma_S(E_{P_i}), H_{P_i})_{P_i \in \mathcal{P}_N})$ and $H_N$.
5: Node N assembles the packet $(E_N, \sigma_S(E_N), T_N, H_N)$, and sends it to each child Cj.



In Step 2, for each parent $P_i$ from which N receives a packet: N checks the validity signature of the
packet to establish whether $P_i$ sent polluted data or not; then, N checks the test token $T_{P_i}$ by running the
verification test to establish that $P_i$ coded correctly; next, N needs to make sure $P_i$ sent a correct helper
token (without which N could not compute a good test token $T_N$ himself and would fail the verification
test at his children).

If any of the checks above fail, N will report them and act in some way that is application-specific. As 
we will see in Section 4.7, N can accompany his complaint with a proof that his parent cheated. 

In our protocol, each node verifies his parents (if he is not the source) and is verified by his
children (if he is not a sink/destination). Thus, N verifies $P_i$, and $C_j$ verifies N.
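
The control flow of Algorithm 1 can be sketched as a function that takes the still-unspecified procedures as parameters. This is only an illustrative skeleton: the `Packet` type, the `CheckFailed` exception, the argument names, and the choice to let `combine` return both tokens are assumptions of this sketch, not part of the protocol definition.

```python
from typing import Callable, NamedTuple, Sequence


class Packet(NamedTuple):
    payload: object       # E
    validity_sig: object  # sigma_S(E)
    test_token: object    # T
    helper_token: object  # H


class CheckFailed(Exception):
    """Raised when a parent packet fails one of the Step 2 checks."""


def run_node(parent_packets: Sequence[Packet],
             verify_sig: Callable[[Packet], bool],
             verif_test: Callable[[Packet], bool],
             check_helper: Callable[[Packet], bool],
             code: Callable[[Sequence[Packet]], tuple],
             combine: Callable[[Sequence[Packet]], tuple]) -> Packet:
    # Step 2: run the three checks on every parent packet.
    for index, pkt in enumerate(parent_packets):
        for name, check in (("validity signature", verify_sig),
                            ("VerifTest", verif_test),
                            ("CheckHelper", check_helper)):
            if not check(pkt):
                raise CheckFailed(f"parent {index} failed {name}")
    # Step 3: code over all parent payloads; this also yields sigma_S(E_N).
    payload, validity_sig = code(parent_packets)
    # Step 4: derive the test token T_N and the helper token H_N.
    test_token, helper_token = combine(parent_packets)
    # Step 5: assemble the outgoing packet.
    return Packet(payload, validity_sig, test_token, helper_token)


# Usage with trivial stand-ins for the unspecified procedures.
parents = [Packet(3, "s3", None, "h3"), Packet(4, "s4", None, "h4")]
out = run_node(parents,
               verify_sig=lambda p: True,
               verif_test=lambda p: True,
               check_helper=lambda p: p.helper_token is not None,
               code=lambda ps: (sum(p.payload for p in ps), "s7"),
               combine=lambda ps: ("T_N", "H_N"))
```

Raising an exception on a failed check is one possible way to model the application-specific reaction; a deployment might instead log the failure and file a complaint.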

4.4 How to Force Byzantine Nodes to Code Over All Required Packets



As a first step, we design a verification test that enables any child $C_j$ of a node N to check that N did
indeed code over all of the parent nodes in his required set $\mathcal{R}_N$, i.e., that the packet sent by N to $C_j$ is a
linear combination of packets from parents in the required set with coefficients all of which are not equal
to zero.

A naive solution. The node N can simply forward to each of his children all the packets received from
parents in the required set. Of course, N's parents make sure to sign (using their own secret keys) the
packets they send to N, so that $C_j$ can be sure that the packets forwarded by N are indeed from N's parents.
In other words, N forwards to each child $C_j$ the following data: $E_{P_i}$, $\sigma_S(E_{P_i})$, and $\mathrm{sig}_{P_i}(E_{P_i} \| \sigma_S(E_{P_i}))$, the
coding coefficients used for the packets from each parent, and the newly coded payload $E_N$ with the new
integrity signature $\sigma_S(E_N)$. Each child $C_j$ can then establish whether N coded correctly, because he now
has access to all the information N received from his parents and can thus check that N did not use any
zero coefficients.

Clearly, this solution is bandwidth inefficient: the payload of the packet can be very large and N will
send $|\mathcal{R}_N| + 1$ such payloads to his children, reducing throughput by a factor of $|\mathcal{R}_N| + 1$.

Payload-Independent Protocol (PIP). We now improve on the naive solution by not including
the packet payload in the test token sent for verification, thus saving considerable bandwidth and
throughput.

Each parent $P_i$ sends a helper token consisting of a parent signature on the validity signature:

$H_{P_i} := \mathrm{sig}_{P_i}(\sigma_S(E_{P_i}) \,\|\, \text{"from } P_i \text{ to } N\text{"})$.

The text in $H_{P_i}$ prevents a colluding N from giving this helper token to some other node N', which could
otherwise falsely claim that he received the data from $P_i$.

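The test token $T_N$ of node N is computed by a simple concatenation; specifically, Combine computes
the following test token:

$T_N := (\alpha_1, \sigma_S(E_{P_1}), H_{P_1}) \,\|\, \cdots \,\|\, (\alpha_k, \sigma_S(E_{P_k}), H_{P_k})$, with $k = |\mathcal{R}_N|$,

where $\alpha_i$ is the coding coefficient that N used for the packet from $P_i$.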

The verification test for this version of the protocol is given in Algorithm 2. 



Algorithm 2 VerifTest of node N, by child $C_j$
1: for each parent $P_i \in \mathcal{P}_N$ do
2:   $C_j$ checks that $T_N$ contains an entry $(\alpha_i, \sigma_S(E_{P_i}), H_{P_i})$ for the current parent $P_i$.
3:   $C_j$ checks that $H_{P_i}$ verifies with $pk_{P_i}$ as a signature of $\sigma_S(E_{P_i})$.
4:   $C_j$ checks that $\alpha_i \neq 0$.
5: end for
6: $C_j$ verifies that combining all validity signatures $\sigma_S(E_{P_i})$ and coefficients $\alpha_i$ results in $\sigma_S(E_N)$, according
to the homomorphic property discussed in Section 4.2.



Step 2 verifies that N provided test data for parent $P_i$. Step 3 checks that the data is authentic. Step
4 establishes that the coefficient used in coding over this parent is nonzero. Step 6 checks that the coded
data from N indeed corresponds to coded data over the information from the parents with the claimed
coefficients $\alpha_i$.

We now give some intuition for why Algorithm 2 is a good verification test, leaving a formal proof to
Section 4.8. If N does not code over the packet of some parent, say $P_1$, then in order to
produce a validity signature $\sigma_S(E_N)$ for $E_N$ that verifies successfully under the public key $pk_S$, N needs
to combine only those validity signatures from parents he coded over and not include $\sigma_S(E_{P_1})$ in the
computation; at Step 6, however, $C_j$ uses the validity signatures from all required parents to check $\sigma_S(E_N)$
(each with a coefficient that was checked to be nonzero in Step 4), and the check would fail.

CheckHelper at $C_j$ consists of checking that $H_N$ is indeed a signature on $E_N$ and is not a signature
on zero.
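
The PIP verification test can be sketched end to end on toy values. Two loud assumptions: the validity signatures are small integers in a toy group (modulus `P`, symbol field size `Q`) combined as a product of powers, and the parent signatures are replaced by HMAC tags. HMAC is symmetric, so this verifier needs the parents' keys, whereas in the protocol the child only needs the parents' public keys. All names below are hypothetical.

```python
import hashlib
import hmac

P, Q = 23, 11  # assumed toy group modulus and coefficient field size


def parent_sign(key: bytes, sigma: int, parent: str, node: str) -> bytes:
    """Stand-in for sig_{P_i}(sigma_S(E_{P_i}) || "from P_i to N")."""
    msg = f"{sigma}|from {parent} to {node}".encode()
    return hmac.new(key, msg, hashlib.sha256).digest()


def verif_test(test_token, parent_keys, node, sigma_n):
    """test_token maps parent name -> (alpha, sigma_parent, helper_token)."""
    combined = 1
    for parent, key in parent_keys.items():
        entry = test_token.get(parent)
        if entry is None:                      # Step 2: entry must be present
            return False
        alpha, sigma_p, helper = entry
        expected = parent_sign(key, sigma_p, parent, node)
        if not hmac.compare_digest(helper, expected):  # Step 3: helper is authentic
            return False
        if alpha % Q == 0:                     # Step 4: coefficient is nonzero
            return False
        combined = combined * pow(sigma_p, alpha, P) % P
    return combined == sigma_n                 # Step 6: homomorphic check


keys = {"P1": b"key-1", "P2": b"key-2"}
sigmas = {"P1": 18, "P2": 2}
alphas = {"P1": 4, "P2": 6}
token = {p: (alphas[p], sigmas[p], parent_sign(keys[p], sigmas[p], p, "N")) for p in keys}

# Honest N: the signature of E_N combines both parents.
sigma_honest = pow(18, 4, P) * pow(2, 6, P) % P
# Cheating N: coded only over P1 but still claims coefficient 6 for P2.
sigma_cheat = pow(18, 4, P)

honest_ok = verif_test(token, keys, "N", sigma_honest)
cheat_ok = verif_test(token, keys, "N", sigma_cheat)
```

In this toy run the honest token is accepted and the cheating one is rejected at Step 6, matching the intuition given above.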

The length of the test token $T_N$ is now

$|T_N| = |\mathcal{R}_N| \cdot (|\sigma_S| + |\mathrm{sig}|)$,

where $|\sigma_S|$ denotes the size of the pollution signature and $|\mathrm{sig}|$ the size of a signature under the scheme introduced
in Section 4.1. Indeed, the length of $T_N$ does not depend on the payload any more. Also, recall that the
lengths of the signatures are constant. Note, though, that $|T_N|$ is linear in the number of parents; this may
not be a problem, but in applications where the payload is not that large or where there can be many
parents, it would be desirable to have a smaller token. Moreover, verifying $|\mathcal{R}_N|$ digital signatures in the
verification test (Step 3 above) will become expensive if the number of parents is not small.

$B_1 = (h(A_{P_1}), \sigma_S(\alpha_{P_1} E_{P_1}))$        $B_2 = (h(A_{P_2}), \sigma_S(\alpha_{P_2} E_{P_2}))$

$B_{1,2} = (h(B_1 \| B_2), \sigma_S(\alpha_{P_1} E_{P_1} + \alpha_{P_2} E_{P_2}))$

$B_{1,2,3,4} = (h(B_{1,2} \| B_{3,4}), \sigma_S(\alpha_{P_1} E_{P_1} + \alpha_{P_2} E_{P_2} + \alpha_{P_3} E_{P_3} + \alpha_{P_4} E_{P_4}))$

$B_{1,\ldots,|\mathcal{R}_N|} = (T_N, \sigma_S(E_N))$

Figure 3: Diagram representing the computation of the test token $T_N$: for each parent
$P_i$, define $A_{P_i} = \sigma_S(E_{P_i}) \| H_{P_i} \| \alpha_{P_i}$; then recursively apply the hash function $h$ as indicated,
and recursively compute pollution signatures with the indicated coefficients.

Logarithmic Payload-Independent Protocol (Log-PIP). We provide a second protocol in which
the length of the test token $T_N$ is significantly shorter:

$|T_N| = |h| + |\sigma_S| + |\mathrm{sig}| + 2|\sigma_S| \cdot \log(|\mathcal{R}_N|)$,

where $|h|$ is the size of a hash (e.g., 160 bits for SHA-1), thus replacing $|\mathcal{R}_N|$ with the much smaller value
$\log(|\mathcal{R}_N|)$ (where the logarithm is base 2). The second protocol that we present, however, is probabilistic in
its guarantees: rather than enabling a child $C_j$ to test if a parent N cheated (with overwhelming confidence),
we enable $C_j$ to detect misbehavior of N with a certain (adjustable) probability.
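
To get a feel for the two token lengths, the sketch below evaluates them for a few required-set sizes. It reads the Log-PIP length as $|h| + |\sigma_S| + |\mathrm{sig}| + 2|\sigma_S| \log_2 |\mathcal{R}_N|$; the concrete sizes (1024-bit validity and parent signatures, 160-bit hash) and the rounding of the logarithm up to an integer are assumptions made only for illustration.

```python
import math


def pip_token_bits(num_required: int, sigma_bits: int, sig_bits: int) -> int:
    """PIP: one (validity signature, parent signature) pair per required parent."""
    return num_required * (sigma_bits + sig_bits)


def log_pip_token_bits(num_required: int, hash_bits: int,
                       sigma_bits: int, sig_bits: int) -> int:
    """Log-PIP: root hash, one signature pair, and a logarithmic number of path values."""
    depth = math.ceil(math.log2(num_required))  # assumed rounding for non-powers of two
    return hash_bits + sigma_bits + sig_bits + 2 * sigma_bits * depth


HASH_BITS, SIGMA_BITS, SIG_BITS = 160, 1024, 1024  # assumed sizes

sizes = {n: (pip_token_bits(n, SIGMA_BITS, SIG_BITS),
             log_pip_token_bits(n, HASH_BITS, SIGMA_BITS, SIG_BITS))
         for n in (4, 16, 64)}
```

Under these assumed sizes the PIP token grows linearly (32768 bits for 16 parents) while the Log-PIP figure grows only with the tree depth (10400 bits for 16 parents).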

Specifically, after receiving the packet from N, node $C_j$ picks a required parent of N at random and
challenges N to prove that he coded correctly over that parent. Of course, N does not know ahead of
time on what packets he will be challenged. As shown in Section 6, such a probabilistic approach is still
quite effective because the chance that a Byzantine node N escapes detection shrinks exponentially in the
number of times he attempts to cheat. In Section 6, we provide recommendations for when we believe it
is more appropriate to use PIP or Log-PIP.

The basic idea of Log-PIP is that N will send to $C_j$ a test token $T_N$ that is the root of a Merkle hash tree
constructed over the data of the test token used in PIP; namely, a Merkle hash tree where the elements at
the leaves are the tuples $(\alpha_i, \sigma_S(E_{P_i}), \mathrm{sig}_{P_i}(\sigma_S(E_{P_i}) \| \text{"from } P_i \text{ to } N\text{"}))$ ranging over the parents $P_i$. Each
$C_j$ will then challenge N by asking to see a certain path in the Merkle hash tree corresponding to a parent
of N. In this way, $C_j$ can check if N coded over that parent (i.e., N used a non-zero coefficient). Of course,
N cannot provide arbitrary data to $C_j$ when replying to the challenge of $C_j$, as guaranteed by the security
properties of a Merkle hash. Therefore, if N did not code over a parent, $C_j$ will discover this with a known
probability.

Let $h$ be a hashing scheme. Figure 3 illustrates the Merkle tree that N has to compute and provides
notation for our discussion. We slightly modify the traditional Merkle hash, by adding data at internal
nodes and changing the recursion. Each leaf node $A_{P_i}$ in the Merkle tree consists of a "summary" of the
data from a required parent $P_i$ of N. Each internal node consists of the validity signature obtained by coding
over all the packets at the leaves of the subtree rooted at the internal node and a hash of the two children.
The root node will thus contain the test token $T_N$ as the root hash and the validity signature over $E_N$,
namely $\sigma_S(E_N)$. Thus, Combine consists of computing the Merkle hash to obtain the Merkle hash root, so
that $T_N$ = "Merkle hash root". $H_{P_i}$ is the same as in PIP, that is, $H_{P_i} = \mathrm{sig}_{P_i}(\sigma_S(E_{P_i}) \| \text{"from } P_i \text{ to } N\text{"})$.

The verification test VerifTest is run in a different way than in PIP. Each node $C_j$ receives the
packet from N, checks the validity signature, and can proceed to code and forward the packet. It
can then challenge node N to check whether N indeed coded over all the packets. During a challenge, only
$\log|\mathcal{R}_N|$ source signatures and hashes will be retrieved, due to the Merkle tree property. Moreover, only
one digital signature will be verified, the one corresponding to the parent from the challenge. For the
Merkle recursion, only hash verifications and homomorphic signature operations (which typically consist
of multiplying 1024-bit numbers) will be performed, so the overall cost is dominated by a single digital
signature verification.

The number of challenges is selected based on the desired probability of detection. With $t$ challenges,
there is a probability of $t/|\mathcal{R}_N|$ of detecting that N did not code over a parent. After $r$ transmissions in
which N cheats, the probability of detecting N is at least $1 - (1 - t/d)^r$, where $d = |\mathcal{R}_N|$ (this bound is attained when N cheats
minimally, by not coding over one parent), which approaches 1 exponentially fast in $r$. Coupled with penalties,
such a probabilistic approach offers incentives against cheating.
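
The detection bound is easy to evaluate when choosing the number of challenges. The helper below is a small sketch of that arithmetic; the function names and the sample values (one challenge, ten required parents, a 99% target) are assumptions for illustration.

```python
import math


def detection_probability(t: int, num_required: int, r: int) -> float:
    """Lower bound 1 - (1 - t/d)^r on catching N after r cheating transmissions."""
    per_transmission = min(1.0, t / num_required)
    return 1.0 - (1.0 - per_transmission) ** r


def transmissions_needed(t: int, num_required: int, target: float) -> int:
    """Smallest r for which the bound reaches the target probability."""
    per_transmission = min(1.0, t / num_required)
    if per_transmission >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - per_transmission))


single = detection_probability(t=1, num_required=10, r=1)
many = detection_probability(t=1, num_required=10, r=20)
needed = transmissions_needed(t=1, num_required=10, target=0.99)
```

With a single challenge per transmission and ten required parents, one cheating transmission is caught with probability 0.1, and 44 cheating transmissions are enough for the bound to exceed 0.99.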

Node N needs to remember the values that constituted the Merkle tree until his children have finished
challenging him. One challenge checks that the node coded correctly over a parent; multiple challenges can
be sent at once and processed together.



Algorithm 3 Challenge on node N by node $C_j$
1: $C_j$ picks a parent $P_i$ of N at random and informs N of the choice.
2: N must present $A_{P_i}$ (defined in Figure 3) and the values of all nodes in the Merkle tree that lie on the
path from $A_{P_i}$ to the root, together with the values of their siblings.
3: $C_j$ runs VerifTest:
   i   verifies that $H_{P_i}$ is a correct signature over $\sigma_S(E_{P_i})$ using $pk_{P_i}$ and that $\alpha_i \neq 0$,
   ii  verifies that the validity signature in $A_{P_i}$ combined with $\alpha_i$ is the same as the validity signature
       in $B_i$,
   iii verifies that the validity signature at each internal node is a multiplication of the validity signatures at the
       children of the node,
   iv  recomputes the Merkle hash based on the hashes provided by N and checks equality to $T_N$,
   v   checks that the validity signature provided when N initially transmitted matches the validity
       signature at the top of the Merkle tree,
   vi  checks that N's signature verifies under $pk_N$ as a signature on $\sigma_S(E_N)$.



As for CheckHelper, $C_j$ still needs to check that $H_N$ is indeed a signature on $E_N$ and is not a signature
on zero to prevent N from causing $C_j$ to fail the verification test at $C_j$'s children. Proofs of security for
this protocol are included in Section 4.8.
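
The modified Merkle tree and a single challenge can be sketched as follows. This is a simplified illustration under stated assumptions: validity signatures are integers in a toy group (modulus `P`, coefficient field size `Q`), the helper tokens are opaque byte strings whose parent signatures are not verified here (check (i) is reduced to the nonzero-coefficient test), the number of parents is a power of two, and the verifier recomputes each node on the path rather than comparing against values supplied by N. All function names are hypothetical.

```python
import hashlib

P, Q = 23, 11  # assumed toy group modulus and coefficient field size


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def enc(node) -> bytes:
    """Serialize a tree node (hash, validity signature)."""
    digest, sig = node
    return digest + sig.to_bytes(4, "big")


def first_level(leaf):
    """B_i = (h(A_{P_i}), sigma^alpha) for a leaf A_{P_i} = (sigma, helper, alpha)."""
    sigma, helper, alpha = leaf
    a = sigma.to_bytes(4, "big") + helper + alpha.to_bytes(4, "big")
    return (h(a), pow(sigma, alpha, P))


def parent_node(left, right):
    """Internal node: hash of both children, product of their validity signatures."""
    return (h(enc(left) + enc(right)), left[1] * right[1] % P)


def build_tree(leaves):
    """Return all levels, from the B_i nodes up to the root (T_N, sigma_S(E_N))."""
    level = [first_level(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [parent_node(l, r) for l, r in zip(level[0::2], level[1::2])]
        levels.append(level)
    return levels


def open_path(levels, index):
    """Sibling nodes along the path from leaf `index` to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index //= 2
    return path


def verify_challenge(leaf, index, path, root_hash, sigma_n):
    if leaf[2] % Q == 0:               # coefficient must be nonzero
        return False
    node = first_level(leaf)
    for sibling in path:               # recompute hashes and signature products
        node = parent_node(node, sibling) if index % 2 == 0 else parent_node(sibling, node)
        index //= 2
    return node[0] == root_hash and node[1] == sigma_n


honest = [(18, b"H1", 4), (2, b"H2", 6), (3, b"H3", 5), (9, b"H4", 7)]
levels = build_tree(honest)
root_hash, sigma_n = levels[-1][0]
honest_results = [verify_challenge(honest[i], i, open_path(levels, i), root_hash, sigma_n)
                  for i in range(4)]

# A cheating N that skips the second parent (coefficient 0) is caught only
# when the challenge happens to land on that parent.
cheat = [(18, b"H1", 4), (2, b"H2", 0), (3, b"H3", 5), (9, b"H4", 7)]
cheat_levels = build_tree(cheat)
cheat_root, cheat_sigma = cheat_levels[-1][0]
cheat_results = [verify_challenge(cheat[i], i, open_path(cheat_levels, i), cheat_root, cheat_sigma)
                 for i in range(4)]
```

In this run every challenge against the honest tree succeeds, while the cheating tree fails exactly one of the four possible challenges, which is the source of the $t/|\mathcal{R}_N|$ detection probability.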

Collusion. Both PIP and Log-PIP are collusion resistant: even if a child colludes with N, the other
children check N independently. Moreover, if N colludes with a parent $P_i$, then, as long as N has at least
one honest child verifying him, N still needs to code correctly over the rest of the parents that he did not
collude with, because he cannot forge these parents' signatures.

4.5 How to Force Byzantine Nodes to Code Pseudorandomly 

As a second step, we design a verification test that enables any child $C_j$ of a node N to check, not only
whether a packet received from N is valid and derived using non-zero coefficients over each parent in his
required set (as was guaranteed by the solution presented in Section 4.4), but also whether the node N
coded using (pseudo)random coefficients in Z*.

The basic idea is to require node N to generate the pseudorandom coefficients from a seed that is also
known to each child $C_j$, so that each $C_j$ will be able to generate these same coefficients and use them as
part of his verification test on N.

We assume that each client knows a random seed $s$ that is public; a trusted party drew the seed at
random when the system started. For example, a client can learn about the seed $s$ when he joins the
system. In a wireless setting with no membership, a node can either have $s$ already hardcoded, or he
can obtain it from his neighbors ($s$ can be accompanied by a signature from a trusted party to make sure
that malicious neighbors cannot lie about its value). The seed can remain the same for the lifetime of the
system.

Using the seed $s$, the coefficients can then be generated using a pseudorandom function $F_s$ (defined in
Section 4.2). For each parent $P_i$ in the parent set $\mathcal{P}_N$, the node N computes $\alpha_i^* = F_s(P_i \| N)$ (of course,
mapped to the field of the coefficients) and uses $\alpha_i^*$ as the coding coefficient for the packet from $P_i$.

Observe that, contrary to what the definition of the pseudorandomness property [GGM86] prescribes,
the seed $s$ is not kept private, but is instead made public. Of course, in such a case, one cannot expect that
the input-output relation induced by $F_s$ is unpredictable; indeed, it is deterministic, because now $F_s$ may
be computed by anyone (and is not an "oracle" anymore). Nonetheless, since in our setting the inputs
to $F_s$ are not under the control of Byzantine nodes, and are predetermined, it is easy to show that the
outputs of $F_s$ on these inputs will still retain the statistical properties that we are interested in, allowing
for the network throughput to still be maximal using these "pseudorandom" coefficients.

If one wishes to enable N to use a different set of coding coefficients for each child $C_j$, the computation
of the coding coefficients can be changed to $\alpha_{i,j}^* = F_s(P_i \| N \| C_j)$; thus N must use $\alpha_{i,j}^*$ to code over the data
from $P_i$ when preparing a packet for child $C_j$. Intuitively, using different coefficients increases throughput
in some topologies because of more diversity; this can be helpful in P2P networks, for example, but not
so much in a wireless setting where transmitting different data to children will not take advantage of the
shared medium on which multiple children can listen.

The verification tests in previous sections can now be easily modified to have each child $C_j$ check that
N coded over each parent in the required set with these exact coding coefficients $\alpha_i^*$ (or $\alpha_{i,j}^*$): in Step 4 of
Algorithm 2 and in Step 3i of Algorithm 3, node $C_j$ must check that $\alpha_i$ equals $\alpha_i^*$ (or $\alpha_{i,j}^*$). With this
check in place, Byzantine nodes are forced to code with pseudorandom coefficients. Section 4.8 shows that
Byzantine nodes cannot code with different coefficients and pass the verification test.
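
One way to picture the coefficient derivation is with a keyed hash playing the role of $F_s$. The sketch below uses HMAC-SHA256 as that stand-in and a prime field of size $2^{31} - 1$ for the coefficients; both choices, the `"||"` separator, and the function names are assumptions for illustration, and the PRF defined in Section 4.2 may differ.

```python
import hashlib
import hmac

FIELD_PRIME = 2**31 - 1  # assumed coefficient field size


def coding_coefficient(seed: bytes, *identities: str) -> int:
    """Stand-in for F_s(P_i || N [|| C_j]), mapped to a nonzero field element."""
    # Assumes identifiers never contain the separator "||".
    message = "||".join(identities).encode()
    digest = hmac.new(seed, message, hashlib.sha256).digest()
    # Map to the range 1 .. FIELD_PRIME - 1 so the coefficient is never zero.
    return int.from_bytes(digest, "big") % (FIELD_PRIME - 1) + 1


def child_accepts(seed: bytes, claimed_alpha: int, parent: str, node: str) -> bool:
    """The child's extra check: the claimed coefficient must equal the derived one."""
    return claimed_alpha == coding_coefficient(seed, parent, node)


seed = b"public-seed"
alpha_p1 = coding_coefficient(seed, "P1", "N")          # same for every child
alpha_p1_c2 = coding_coefficient(seed, "P1", "N", "C2") # per-child variant
accepted = child_accepts(seed, alpha_p1, "P1", "N")
rejected = child_accepts(seed, alpha_p1 + 1, "P1", "N")
```

Because the seed is public, any child can recompute the coefficient and reject a packet whose claimed coefficient differs from it.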

4.6 How to Prevent Replay Attacks of Old Data 

One problem is that a Byzantine client may code correctly for one transmission, but may attempt to 
cheat on the next transmission by sending the old data he sent for the first transmission. In some cases, 
such a strategy reduces throughput, but in others, it even pollutes packets downstream in the network. 
Nevertheless, the Byzantine client will pass any pollution test because the source uses the same keys for 
signing in both transmissions; the node will also pass our diversity tests above because he coded correctly 
over his parents in the first transmission. 

Therefore, we need to prevent such replay attacks. In fact, the problem of replay attacks is inherent to
the use of pollution schemes and is not introduced by our diversity enforcement scheme. Any solution for
that setting will suffice in our setting as well because of the way we build "on top" of validity signatures.
Therefore, any overhead introduced by such a scheme is already introduced by the use of pollution schemes
and does not come with diversity enforcement.

We propose one such replay solution. The idea is to have the source change the validity signature key
with every transmission so that any attempt by a Byzantine client to use old data would be detected when
checking the validity signature. Let $(sk_{S,k}, pk_{S,k})$ denote the key pair used by the source in the $k$-th
transmission. The source has one master signing key pair of which the public verification key is known
to all users as before. To inform nodes of the public key used during a transmission, the source will send
with every packet this public key accompanied by a signature of this public key using the master signing
key. The source signs the public key to prevent malicious clients from forging public keys of their own and
claiming they belong to the source.

For our diversity scheme, we make use of the public key corresponding to each transmission to add diversity
in the coding coefficients across transmissions. Each node should now code with $\alpha_i^* = F_s(P_i \| N \| pk_{S,k})$,
and their children will check the inclusion of $pk_{S,k}$ in the coding coefficients along with the other tests they
perform; without $pk_{S,k}$, the coding coefficients will be the same across different transmissions.

4.7 How to Enable Nodes to Prove Misbehavior 

We discuss how any child $C_j$ of a node N can prove N's misbehavior to a third party, when the verification
test for N fails. Recall that the ability to convince a third party (such as the source, a membership service,
or other authoritative agents in the system) that N did indeed misbehave is important to allow for punitive
measures to be enacted. Furthermore, the ability to prove misbehavior reinforces the deterrent effect of
verification tests.

We use signatures in a natural way to provide such proofs: Step 5 of Algorithm 1 is modified so that
a node N attaches an additional "attest" token to the packet he sends to his children; the attest token
consists of a signature of the whole packet under his own secret key $sk_N$. Each child $C_j$ of N will then
verify this signature (and ignore any data from N that does not carry a valid "attest" signature).

If a child $C_j$ establishes that his parent N did not code correctly based on the verification tests in
Algorithm 2 or Algorithm 3, he can provide the packet from N together with his attest token as proof to a
third party. Any other party knowing the required set $\mathcal{R}_N$ of node N can run the VerifTest procedures
to establish if N cheated. Of course, by the unforgeability property of the signature scheme, children of N
cannot falsely accuse N of misbehavior.

4.8 Proofs of Security 

Theorem 4.1 (Security of PIP). In protocol PIP, if a generic node N passes all checks at an honest
child $C_j$, it means that N coded over the value from $P_1$, $E_{P_1}$, with precisely coefficient $c_1$ (as described in
Section 4.5), where $P_1$ is any generic parent from N's required set.

Proof. Algorithm 2 gives VerifTest for the PIP protocol. If N passes the checks in Step 2, it means that
N provided the triple $(c_1, \sigma(E_{P_1}), H_{P_1})$ in $T_N$; if N passes the checks in Step 3 and Step 4, it means that $P_1$
indeed provided $\sigma(E_{P_1})$ and $c_1 \neq 0$; if N passes the check in Step 6, it means that N computed $\sigma(E_N)$ by
including $\sigma(E_{P_1})$ with coefficient $c_1$ in the homomorphic computation (described in Section 4.2).

In Step 2 of Algorithm 1 when run by $C_j$, the node $C_j$ checks that $\sigma(E_N)$ verifies as a signature of
$E_N$. By the theorem's hypothesis, the pollution signature verifies, so that, by the security of the pollution
scheme (detailed in [BFKW09]), it must be the case that N included $c_1 \cdot E_{P_1}$ when computing $E_N$. □

Theorem 4.2 (Security of Log-PIP). In protocol Log-PIP, if a generic node N did not code over any
given parent, say $P_1$, from his required set with coefficient $c_1$ (as described in Section 4.5), and an honest
child $C_j$ challenges N on $t$ random parents, the probability that N is detected (some check fails) is at least
$t/|\mathcal{R}_N|$.
Proof. The strategy of the proof is to present an exhaustive list of cases in which N might not have coded
over a parent, and show that in each such case the probability of detection is $\geq t/|\mathcal{R}_N|$. Consider the
tree $\mathrm{Tree}_N$ of values that N used when he computed the Merkle hash that he gave to $C_j$. Because of the
Merkle hash guarantees, N cannot come up with any other tree (that is not a subtree of $\mathrm{Tree}_N$) that
has the same Merkle root hash. If any leaf $i$ in this tree (if a leaf exists) does not satisfy check (i) in
Step 3 of Algorithm 3, it will be caught if $C_j$ challenges N on parent $P_i$, which happens with probability
$t/|\mathcal{R}_N|$. Similarly, if any internal node $B_i$ does not satisfy check (ii), $C_j$ will detect this with probability at
least $t/|\mathcal{R}_N|$. Therefore, we can assume that the first level of internal nodes in $\mathrm{Tree}_N$ consists of the
expected hashes and $\sigma(c_i E_{P_i})$, where $c_i$ is the desired coefficient and $\sigma(E_{P_i})$ is indeed the validity signature
from parent $P_i$. If any internal node in $\mathrm{Tree}_N$ does not satisfy check (iii), this will be detected whenever
N is challenged on a value $i$ that involves a path through the Merkle tree passing through the broken
internal node; this happens with probability at least $t/|\mathcal{R}_N|$. Therefore, assuming all internal nodes pass
check (iii), it means that the validity signature at the top of the tree must be $\sigma(\sum_i c_i E_{P_i})$. If the validity
signature at the top of $\mathrm{Tree}_N$ does not match the one initially provided by N (i.e., $\sigma(E_N)$), check (v) will fail
with probability 1. Assuming this check succeeds, it must be the case that the validity signature initially
provided by N is a proper validity signature after coding with $c_i$ over all $P_i$. Since the validity signature
matched $E_N$ (Step 2 of Algorithm 1 when run at child $C_j$), it means that N coded over all parents with
the right coefficients, by the guarantees of the validity signature. Therefore, there are no more cases of
possible cheating from N to consider, and since all previous types of cheating were caught with probability
$\geq t/|\mathcal{R}_N|$, the proof is complete. □

5 Applications and Extensions 

In this section, we describe applications and extensions of our protocol.

5.1 Types of Required Sets

In our protocols so far, we considered that a child of node N performs the verification test on a specific
set of required parents for N. However, one can use different types of verification tests, some being more
useful for certain settings, as we will see. All these verifications, in fact, just map to verifying a specific
required set as before.

A child $C_j$ can perform any of the following checks for node N:

(1) N coded over all his parents or over a specific set of parents.

(2) Threshold enforcement: N coded over at least $d$ parents. This check can be enforced by having N send an
indication of which parents he coded over with their public keys and certificates (defined in Section 3.2):
$C_j$ checks that these are at least $d$ in number, checks the certificate of each public key to make sure N
did not falsify these keys, and checks that N indeed coded over them.

(3) N coded over at least some subset of parents. This is a combination of Item 1 and Item 2. $C_j$ checks
that N coded over the subset of parents as in Item 1 and over some valid parents as in Item 2.

(4) N coded over a set of parents with some application-level property. For example, N must code over at
least two parents noted by some application as high priority and at least five parents in total. The
priority of each node $P_i$ can be included in the certificate $\mathrm{cert}_{P_i}$. N again indicates the nodes he coded
over to $C_j$ along with their public keys and certificates, and $C_j$ checks that at least two certificates
contain high priority and there are at least five in total. Other general application semantics can be
supported by this verification case.
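
The counting part of Items (2) and (4) can be sketched as two small predicates. The sketch assumes the certificates have already been validated and are available as a mapping from parent identifier to certificate fields; that mapping, the `"priority"` field, and the function names are hypothetical. Checking that N actually coded over the reported parents is still done by PIP or Log-PIP and is not shown.

```python
from typing import Dict, Iterable


def threshold_check(reported: Iterable[str], certified: Dict[str, dict], d: int) -> bool:
    """Item (2): N reports at least d distinct parents, each with a valid certificate."""
    parents = set(reported)
    return len(parents) >= d and all(p in certified for p in parents)


def priority_check(reported: Iterable[str], certified: Dict[str, dict],
                   min_high: int, min_total: int) -> bool:
    """Item (4): enough parents in total, and enough of them marked high priority."""
    parents = set(reported)
    if not threshold_check(parents, certified, min_total):
        return False
    high = sum(1 for p in parents if certified[p].get("priority") == "high")
    return high >= min_high


certified = {
    "P1": {"priority": "high"}, "P2": {"priority": "high"}, "P3": {"priority": "low"},
    "P4": {"priority": "low"}, "P5": {"priority": "low"},
}
all_five = ["P1", "P2", "P3", "P4", "P5"]

meets_threshold = threshold_check(all_five, certified, d=5)
meets_priority = priority_check(all_five, certified, min_high=2, min_total=5)
forged_parent = threshold_check(["P1", "P2", "P9"], certified, d=3)
```

Here the full list satisfies both checks, while a report containing the uncertified identifier `P9` is rejected.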

5.2 Applications and Required Sets 

In this section, we describe the various settings to which our protocols are applicable, and how the nodes 
would learn of the required set of their parents. 

Our model applies to settings in which a node can learn the required set of his parents, such as: 

1) Systems with a membership service: the membership service can inform a node of his grandparents when 
the node joins and when changes occur. Some peer-to-peer and content distribution systems fall in this 
category. 

2) Systems having a reliable yet potentially low capacity channel besides the channel where the coding
occurs (which may be less reliable, but has higher capacity): the reliable channel can be used to communicate
topology changes between nodes. Some examples of applications are decentralized peer-to-peer
applications and content distribution, as well as some wireless networks.

3) Static topologies: these topologies do not change or change rarely. The topology is mostly known to the 
nodes (e.g., nodes can discover it when joining), so a node will know his grandparents. Wired as well 
as some wireless network applications fall in this category. For wired networks, since the topology is 
more static and delays tend to be lower, more aggressive verification tests can be implemented (e.g. the 
required set is most of the parents or all of the parents, depending on the particular system). 

4) Moderately dynamic wireless topologies: the set of grandparents for a node may change many times, but after
each change it remains the same for long enough to allow the node to discover the new grandparents.

Let us discuss how a child can learn about his changing grandparents in dynamic topologies. First of all,
for such topologies, we recommend that nodes use the threshold enforcement scheme (described in Item 2
of Section 5.1) because the set of parents of a node changes dynamically. The threshold should be adjusted
based on some minimum number of links a node is expected to have in order to code diversely.

Consider that the parents of node N have changed and child $C_j$ wants to learn about this. We use the same
links used by packet flow to inform $C_j$ of his grandparents. Each new parent $P_i$ sends N his public key
and the corresponding certificate $\mathrm{cert}_{P_i}$; N sends this information to $C_j$. Let us discuss the case when
N is malicious and may try to inform $C_j$ of an incorrect parent list. Note that N cannot lie that $P_i$ is a
parent when he is not because, if N does not have a link to $P_i$, during transmission time the nodes $C_j$ will
verify that N coded over the data from $P_i$, which N could not have done because he did not receive this
data. Moreover, N cannot create some public keys of his own and claim that some parents with those
public keys exist, because each node key has a certification as discussed. On the other hand, N may
try to simply not report any of his parents so that he does not have to forward or code over any data.
However, each child $C_j$ will expect N to report at least a threshold of parents; if N does not do so, $C_j$
can be suspicious and denounce N as potentially malicious, as discussed in Section 3.2. Therefore,
N can choose which $d$ parents to code over from the set of parents physically linked to him, but he cannot
choose fewer than $d$ such parents.

However, our scheme would not work well for highly changing topologies that also do not fall under
either Item 1 or Item 2. One such example is military ad-hoc wireless networks, where the nodes are in
constant rapid movement; this would not allow a child to discover his grandparents effectively.

5.3 Extensions

In this section, we describe how our protocol could be applied to other network coding scenarios. 

First, note that we did not make any assumption about what a link or a node really is. A link can be 
a physical link, a chain of physical links, or even a subnetwork. For example, in a peer to peer network, 
a link can include an entire subnetwork via which some peers send data to a receiving peer. In this case, 
our protocol can be used to check that the receiving peer coded over all sender peers when he forwards the 
packets to some other peer. As another example, a link in a wired network may represent a connection, 
while a link in a wireless network may be the ability to hear/communicate with another node or be an 
edge induced by the data transmission graph. Moreover, a node can be a physical node (a router, a peer 
in a P2P network) or a subnetwork; in fact, a few nodes in our model can form one node for a certain 
system. Using these observations, we can express constraints of real-world networks: 

Multiple packets may be sent on some links. Consider that parent $P_i$ has a capacity of $p$ packets
on the link to node N. In this case, in our protocol, $P_i$ will be represented as $p$ different nodes, each with
a different public key. With this transformation, our protocol can be used unchanged.

Broadcast links. Broadcast in wireless can be mapped to our model by having the parent have one link 
(the same link) to all his children (basically, viewing all children as one child), and our protocols can be 
applied unchanged. 

Multi-source network coding. In the multi-source network coding case, intermediate nodes combine 
packets for different files from different sources, but each source operates independently and may not 
communicate with the others. In such work, the metadata of the packet is augmented with information 
about which source and which file identifiers the current packet contains. 

To support our protocols in the multi-source case, note that PIP and Log-PIP depend on source
information only when checking validity signatures. Moreover, our protocols are built modularly on top of
a validity signature and do not depend on any particular scheme. This means that all we need is a
multi-source validity signature and the rest of the algorithms will remain unchanged. Recent work [ABBF10]
proposes such schemes: sources can send packets independently of each other, each packet contains a
validity signature, and these signatures can be checked at each intermediate node by knowing the public
keys of each of these sources. Children will be able to check if their parents coded over the appropriate
grandparents as before.

Asynchronous networks and delay intolerant networks. A node N may receive data from his parents
at different times. For efficiency reasons, N may have to code over the data that he has already received
and send the data forward, rather than wait until a piece has arrived from every parent. In this case, a child of N
can enforce the threshold verification above, thus checking that the packet from N is coded over at least a
few parents.

Various levels of abstraction. Our protocol can be used at various levels of abstraction. For example, 
in peer-to-peer networks, nodes can perform: 

• End-to-end check. A peer can check that the data from another peer is the result of coding over 
the data of every source in a given set, even if those sources communicated with the tested peer via other 
nodes or networks. 

• Individual node check. A peer can check that the data from another peer is the result of coding over 
all of the peers to which this peer should be connected according to the peer-to-peer algorithm 
or other application they run. 

Many P2P systems now run on smartphones. In Section 6, we show that our 
protocol is efficient even when run on a smartphone such as the Android-based Nexus One. 

6 Implementation and Evaluation 

In this section, we evaluate the usefulness and the performance of our protocol. 

6.1 Simulation 

We ran a Python simulation to show that Byzantine behavior that previous work does not detect, but that 
our protocols do detect, causes significant throughput loss. We examined three types of node behavior: (Mode 
1) Byzantine nodes choose coding coefficients such that their packet does not provide new information to 
their children; (Mode 2) Byzantine nodes simply forward one of the received packets (and do not code); 
(Mode 3) Byzantine nodes are forced to code with pseudorandom coefficients. Neither 
Mode 1 nor Mode 2 is detected by prior work on pollution schemes, but both are detected by our 
protocols. Mode 3, which is the correct behavior, is enforced only by our protocols. 

Figure 4: (a) One and (b) ten Byzantine nodes on the mincut. 

The simulation constructs a graph by assigning edges at random between nodes while maintaining the 
given minimum cut. The Byzantine nodes are placed on the minimum cut. We ran the simulation for [50 
nodes, 1000 edges, 5 packets sent from the source, min-cut up to 10, 1 Byzantine node] and [100 nodes, 
2000 edges, 20 packets sent from the source, min-cut value up to 20, 10 Byzantine nodes]. Figure 4 shows 
the throughput (i.e., the degrees of freedom) at the sink plotted against the min-cut value. The 
throughput difference between Modes 1/2 and Mode 3 is significant. Moreover, when the min-cut 
value of the network is small (e.g., 5), Mode 3 can as much as double the throughput 
(see min-cut value of 3 in Figure 4(a)). In Figure 4(b), the throughput difference is more 
pronounced: Mode 3 has a throughput of about 10 degrees of freedom more than Mode 1 (which is 50% of 
the data sent by the source) and about 5 degrees of freedom more than Mode 2 (which is 25% of the data 
sent by the source). 
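
The difference between the modes can be illustrated on a toy bottleneck far smaller than the random graphs used in the simulation. The following is a minimal sketch, not the authors' simulation code: the field GF(257), the two-packet topology (a sink that hears packet e1 directly plus one packet from a bottleneck node that holds e1 and e2), and all function names are assumptions made for illustration. Mode 2 is shown in its worst case, where the forwarded packet is one the sink already has, and Mode 3 draws nonzero coefficients so that the toy outcome is deterministic (true random linear coding draws from the whole field and fails with small probability).

```python
import random

P = 257  # small prime field, chosen only for illustration


def rank_mod_p(rows, p=P):
    """Rank of a matrix over GF(p), computed by Gaussian elimination."""
    m = [[x % p for x in row] for row in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col]), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        inv = pow(m[rank][col], p - 2, p)  # inverse via Fermat's little theorem
        m[rank] = [(x * inv) % p for x in m[rank]]
        for r in range(len(m)):
            if r != rank and m[r][col]:
                f = m[r][col]
                m[r] = [(a - f * b) % p for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def bottleneck_output(mode, e1, e2, rng):
    """Coefficient vector emitted by the (possibly Byzantine) bottleneck node."""
    if mode == 1:
        # Adversarial coefficients: a multiple of e1, useless to this sink.
        c = rng.randrange(1, P)
        return [(c * x) % P for x in e1]
    if mode == 2:
        # Plain forwarding; worst case is forwarding what the sink already has.
        return list(e1)
    # Mode 3: code over both inputs with nonzero random coefficients.
    a, b = rng.randrange(1, P), rng.randrange(1, P)
    return [(a * x + b * y) % P for x, y in zip(e1, e2)]


def sink_degrees_of_freedom(mode, seed=0):
    """Degrees of freedom at a sink that hears e1 directly plus the bottleneck."""
    rng = random.Random(seed)
    e1, e2 = [1, 0], [0, 1]
    return rank_mod_p([e1, bottleneck_output(mode, e1, e2, rng)])
```

In this toy, `sink_degrees_of_freedom(3)` reaches the full two degrees of freedom, while Modes 1 and 2 leave the sink with one, which mirrors the kind of gap plotted in Figure 4.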

6.2 Implementation 

We implemented our protocol as a library (called SecureNetCode) in C/C++ and Java, and also embedded 
it into the Android platform. The C/C++ implementation is useful for lower-level code that is meant to 
be fast: network routers, various wireless settings, and other C/C++ programs. The Java implementation 
is useful for higher-level programs such as P2P applications. We embedded the Java implementation in the 
Android platform and ran it on a Nexus One smartphone because, with the growing popularity 
of smartphones, more P2P content distribution applications are being developed for them, some using 
network coding ([Har11], [Fit08]). 

Our library implementation is available at www.mit.edu/~ralucap/netcode.html. It consists of the 
functions in protocols PIP and Log-PIP. Our library in C/C++ consists of 290 lines and the one in 
Java consists of 274 lines, including comments and blank lines but excluding standard, number-theory, and 
cryptographic libraries. To implement certain cryptographic operations on large numbers, we used NTL 
in C/C++ and BigInteger in Java. As cryptographic algorithms, we used OpenSSL DSA and SHA. The 
size of the validity signature used is 1024 bits. 

Results. The Android results were obtained on a standard Nexus One smartphone; all other results 
were obtained on a 2.0 GHz dual-core processor with 1 GByte of RAM. There was observable 
variability in the results (especially for the Nexus One), so we ran the experiments up to 100 times to compute an 
average time. 

Note that we only evaluate the performance of our diversity scheme and do not evaluate the performance 
of any pollution signature protocol. The reason is that our protocol is not tied to any particular such 
scheme and uses it modularly. To enforce that nodes code with coefficients of one (Section 4.4), the most 
important step for throughput, we invoke the pollution scheme no more than it is invoked without our 
diversity checks. To enforce our full protocol with pseudorandom coefficients, during verification, each 
node computes one additional homomorphic operation of the integrity signature (per parent for PIP and 
per challenge for Log-PIP), typically an exponentiation in a certain group: $\mathrm{sig}_S(E_{p_i})^{\alpha_i}$. Fortunately, the 
coding coefficients are typically relatively small, e.g., 64 bits (even though the integrity signature allows 
them to be as large as q as explained in Section 4.1). Note that the pollution signature verification, which 
is expensive, is not called additionally. 
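
To make the "one additional homomorphic operation" concrete, the sketch below uses a toy, unkeyed homomorphic hash as a stand-in for the validity signature. It is not the scheme used in the paper and offers no security: the modulus, the bases, and all function names are assumptions chosen for illustration. It shows only why scaling a packet by a coding coefficient costs one modular exponentiation on the tag, with a 64-bit exponent.

```python
P = 2**61 - 1            # a Mersenne prime; toy modulus, far too small for real use
GENERATORS = [3, 5, 7]   # hypothetical public bases, one per packet coordinate


def toy_tag(vector):
    """Toy homomorphic tag: product of g_i ** v_i modulo P."""
    tag = 1
    for g, v in zip(GENERATORS, vector):
        tag = (tag * pow(g, v, P)) % P
    return tag


def scale_tag(tag, alpha):
    """Tag of alpha * packet, obtained with a single modular exponentiation."""
    return pow(tag, alpha, P)


def combine_tags(tags_and_coeffs):
    """Tag of a linear combination, given the (tag, coefficient) pair of each parent."""
    out = 1
    for tag, alpha in tags_and_coeffs:
        out = (out * scale_tag(tag, alpha)) % P
    return out


# Usage: a 64-bit coefficient applied to one parent's packet.
packet = [4, 1, 9]
alpha = 2**63 + 5
scaled_packet = [alpha * x for x in packet]
tags_match = toy_tag(scaled_packet) == scale_tag(toy_tag(packet), alpha)
```

Because the exponent is only as long as the coding coefficient, the cost of `scale_tag` depends on the coefficient size and not on the payload size.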
Table 1: Performance results of PIP and Log-PIP in milliseconds. The first 8 rows with values show results for PIP and 
Log-PIP when all coding coefficients are one (Section 4.4). The first column indicates the number of parents of a node. Each 
data cell in the rest of the columns consists of two values: transmission time and verification time. The last row shows the 
additional cost (only for verification) when adding pseudorandom coefficients (Section 4.5) due to the homomorphic operation 
of the validity signature. 

In Table 1, we present performance results of PIP and Log-PIP using one challenge. We consider an 
integrity signature of size 1024 bits and coding coefficients of size 64 bits. 

We can see that, for verification, as we increase the number of parents, the overhead of Log-PIP increases 
very slowly (logarithmically) as compared to the linear performance of PIP. The same happens to packet 
size, which we evaluate later in this section. Therefore, we recommend using Log-PIP for scenarios with 
more than three parents, and PIP for cases with at most three parents. Alternatively, one could select 
a hybrid algorithm by performing r > 1 challenges from Log-PIP. The performance of Log-PIP grows 
linearly in the number of challenges so one can tune the probability of detection (see Section 4) based on 
the desired tradeoff with performance overhead. 

We can see that the C/C++ protocols impose modest overhead. For 10 parents, which is a reasonably 
large value, the running time at a node to prepare for transmitting the data is ~0.25 ms and the time 
to verify a packet's diversity is 1.4 ms in total for Log-PIP; for three parents, the time to verify diversity 
is 3.7 ms for PIP. All these values are independent of how large the packet payload is. Compare this 
to the cost of a pollution scheme, for example [BFKW09]. In this scheme, the verification consists of two 
bilinear map computations and m + n modular exponentiations, resulting in a run time of at least 100 ms per parent 
for verification in C using the PBC library for bilinear maps. For three parents, the relative 
overhead of PIP is thus < 2% and of Log-PIP is < 0.5%. Given this low additional overhead, we believe 
that anyone already using a pollution scheme can also use our scheme to provide 
diversity at little extra cost. 
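
The relative-overhead figures follow from simple arithmetic on the numbers quoted in the text. The sketch below reproduces the PIP bound; the variable and function names are hypothetical. Because 100 ms per parent is a lower bound on the pollution check, the computed ratio is an upper bound on the relative overhead.

```python
def relative_overhead(extra_ms, baseline_ms):
    """Extra verification time as a fraction of the baseline verification time."""
    return extra_ms / baseline_ms


parents = 3
pollution_ms_per_parent = 100.0   # lower bound quoted in the text for [BFKW09]
pip_verify_ms = 3.7               # PIP diversity check for three parents

baseline_ms = parents * pollution_ms_per_parent          # at least 300 ms
pip_ratio = relative_overhead(pip_verify_ms, baseline_ms)

# Verification time that a scheme must stay under to meet a 0.5% overhead.
logpip_budget_ms = 0.005 * baseline_ms

print(f"PIP relative overhead: {pip_ratio:.2%}")
print(f"0.5% budget: {logpip_budget_ms:.2f} ms")
```

The PIP ratio comes out at roughly 1.2%, under the stated 2%. The 0.5% bound for Log-PIP corresponds to a verification time below 1.5 ms for three parents, which the 1.4 ms quoted for ten parents already meets.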

The Java and Android implementations are slower because of the language and/or device limitations 
of the Nexus One. Nevertheless, we believe these implementations still perform well when used for 
higher-level applications like P2P content distribution. 

6.3 Packet Size 

For PIP, the packet size increase is $|\mathcal{R}_N| \cdot (|\sigma_S| + 320) + 320$ bits, and for Log-PIP the sum of the packet size increase and the 
information sent during the challenge phase is $480 + |\sigma_S| + 2|\sigma_S| \log(|\mathcal{R}_N|)$ bits, where $|\mathcal{R}_N|$ is the 
number of parents to code over. Recall that $|\sigma_S|$ is the size of the validity signature, and depends on the 
validity scheme used. For instance, if [BFKW09] is used, we have an increase in PIP of $480 \cdot |\mathcal{R}_N| + 320$ bits 
and in Log-PIP of $640 + 320 \cdot \log(|\mathcal{R}_N|)$ bits. As discussed in Section 4, the packet size does not increase 
as the payload grows, so such overhead becomes insignificant when transmitting large files. 
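
The two size expressions can be evaluated for concrete parent counts. The sketch below assumes the PIP increase is (number of parents) x (signature size + 320) + 320 bits and the Log-PIP total is 480 + signature size + 2 x signature size x log(number of parents) bits. The function names are hypothetical, the logarithm is assumed to be base 2, and the 160-bit signature size is inferred from the [BFKW09] figures (480 = 160 + 320) rather than stated in the text.

```python
import math


def pip_overhead_bits(num_parents, sig_bits):
    """Packet size increase for PIP, in bits."""
    return num_parents * (sig_bits + 320) + 320


def logpip_overhead_bits(num_parents, sig_bits):
    """Packet increase plus challenge-phase traffic for Log-PIP, in bits.

    Assumes a base-2 logarithm; the result is fractional when num_parents
    is not a power of two.
    """
    return 480 + sig_bits + 2 * sig_bits * math.log2(num_parents)


SIG_BITS = 160  # inferred signature size for the [BFKW09] instantiation

for parents in (2, 4, 8, 16):
    print(parents,
          pip_overhead_bits(parents, SIG_BITS),
          logpip_overhead_bits(parents, SIG_BITS))
```

Under these assumptions, PIP grows linearly in the number of parents and Log-PIP grows logarithmically; in both cases the overhead is a fixed number of bits that does not depend on the payload.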

7 Conclusions 

In this paper, we presented two novel protocols, PIP and Log-PIP, for detecting whether a node coded 
correctly over all the packets received according to a random linear network coding algorithm. No previous 
work defends against such diversity attacks by Byzantine nodes. Our evaluation shows that our protocols 
are efficient and the overhead of both of our protocols does not grow with the size of the packet payload. 